<!-- Copyright 2017 Capital One Services, LLC and Bitwise, Inc.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. -->
 
<!doctype html>
<html>
<head>
	<title>Remove Dups Properties</title>
	<link rel="stylesheet" type="text/css" href="../../css/style.css">
</head>
<body>

<p><span class="header-1">Remove Dups Properties</span></p>

<p><span><b>Properties</b>&nbsp;for the Remove Dups component can be viewed by double-clicking the component on the canvas. The properties window contains only the &#39;General&#39; tab since Remove Dups falls under the Straight Pull category in the component palette.</span></p>

<p><a name="general_properties"></a><span class="header-2">General Properties</span></p>

<p><img alt="" src="../../images/Removedups_Properties_General.png" /></p>

<p><span class="header-2">Display</span></p>

<ul>
	<li><span><b>Name</b> - The identifier for the component. This is a <b>mandatory</b> property. This property is pre-populated with the component name, i.e. 'Removedups' followed by an incremental number. It can be changed to any custom name. The name property has the following restrictions:</span>
	<ul>
		<li><span>Must be specified and should not be blank.</span></li>
		<li><span>Must be unique across the job.</span></li>
		<li><span>Accepts only letters (a-z), numerals (0-9) and 4 special characters: "_", "-", ",", " " (space).</span></li>
	</ul>
	</li>
	<li><span><b>ID</b> - The ID field specifies a unique ID for every component.</span></li>
	<li><span><b>Type</b> - Type defines the type of component within the category. This is typically the name of the component. This is a non-editable field.</span></li>
</ul>
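
<p><span>As a rough illustration of the name restrictions listed above, the following Python sketch checks a proposed component name. The function name_problems and its arguments are hypothetical and are not part of the tool. The sketch also assumes that letters are accepted in either case, since the pre-populated name begins with a capital letter.</span></p>
<pre>
```python
import re

# Assumption: upper-case and lower-case letters are both accepted.
# Allowed: letters, digits, underscore, hyphen, comma and space.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_\-, ]+")


def name_problems(name, names_in_job):
    """Return the list of rule violations for a proposed component name.

    names_in_job is the set of names already used by other components.
    """
    problems = []
    if not name.strip():
        # Rule 1: the name must be specified and must not be blank.
        problems.append("blank")
    elif not NAME_PATTERN.fullmatch(name):
        # Rule 3: only the allowed characters may appear.
        problems.append("invalid character")
    if name in names_in_job:
        # Rule 2: the name must be unique across the job.
        problems.append("not unique")
    return problems
```
</pre>
<p><span>An empty list means the name passes all three checks in this sketch.</span></p>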

<p><span class="header-2">Configuration</span></p>

<ul>
	<li><span><b>Key Fields</b> - The key field grid accepts the key fields that are used to identify duplicate records in the input data. The component uses the key fields to group the records before performing the operation.</span>
	<p><img alt="" src="../../images/Remove_Dups_Key_Field.png" /></p></li>
	<li><span><b>Secondary Keys</b> - Secondary keys accept the columns that further group the records. Here, the user can specify the sort order, which can be either Ascending or Descending. By default, the sort order is Ascending (Asc).</span>
	<p><img alt="" src="../../images/Remove_Dups_Secondary_Keys.png" /></p></li>
	<li><span><b>Runtime Properties</b> -&nbsp;Runtime properties are used to override the Hadoop configurations specific to the Remove Dups component at run time. The user is required to enter the property name and value in the runtime properties grid.</span>
	<p><span>Check <a href="../../How To Steps/How_To_Pass_Hadoop_Properties_To_Component.html">How to pass Hadoop properties to component</a></span></p>
	<p><img alt="" src="../../images/Runtime_Properties_Grid.png" /></p></li>
</ul>
<ul>
	<li><span><b>Retain</b> - Retain has three options:</span>
	<ul>
		<li><span><b>First</b> - Retains the first occurrence of the specified duplicate field and removes the other duplicate records.</span></li>
		<li><span><b>Last</b> - Retains the last occurrence of the specified duplicate field and removes the other duplicate records.</span></li>
		<li><span><b>Unique</b> - Retains any unique record from the duplicate records.</span></li>
	</ul>
	</li>
	
</ul>
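
<p><span>The interaction of Key Fields, Secondary Keys and Retain described above can be sketched in Python as follows. This is only an illustration and not the component's actual implementation. The function remove_dups and its parameters are hypothetical. The sketch assumes that records are grouped by sorting on the key fields in ascending order, and that the Unique option keeps only those records whose key occurs exactly once.</span></p>
<pre>
```python
from itertools import groupby
from operator import itemgetter


def remove_dups(records, key_fields, secondary_keys=(), retain="First"):
    """Deduplicate a list of dict records (illustrative sketch only).

    key_fields:     field names that identify duplicates.
    secondary_keys: (field, order) pairs, where order is "Asc" or "Desc".
    retain:         "First", "Last" or "Unique".
    """
    def key_of(record):
        return tuple(record[field] for field in key_fields)

    ordered = list(records)
    # Python sorts are stable, so sorting from the least significant
    # secondary key up to the key fields yields the combined ordering.
    for field, order in reversed(list(secondary_keys)):
        ordered.sort(key=itemgetter(field), reverse=(order == "Desc"))
    ordered.sort(key=key_of)  # Assumption: key fields sort ascending.

    result = []
    for _, group in groupby(ordered, key=key_of):
        group = list(group)
        if retain == "First":
            result.append(group[0])
        elif retain == "Last":
            result.append(group[-1])
        elif retain == "Unique":
            # Assumption: keep a record only if its key is not duplicated.
            if len(group) == 1:
                result.append(group[0])
        else:
            raise ValueError("retain must be First, Last or Unique")
    return result
```
</pre>
<p><span>In this sketch, the secondary keys decide which record counts as the first or the last one within each group of duplicates.</span></p>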
<ul>
	<li><span><b>Batch</b> - Batch accepts a number from 0 to 99 and signifies the batch in which this component will execute. The default value is 0.</span></li>
	
</ul>

<p><a name="validations"></a><span class="header-2">Validations</span></p>
	<p><span>The Remove Dups component applies validations to the mandatory fields described above. When the Remove Dups component is placed on the job canvas for the first time (from the component palette), it shows a warning icon because the mandatory properties are not provided.</span></p>
	<img src="../../images/Remove_Dups_Warning.PNG" alt="Warning icon displayed on component" />
<p><span>The properties window displays an error icon on a mandatory field if it has an incorrect value. The error icon is displayed on the tab as well if any field within the tab has an error.</span></p>
<img src="../../images/Remove_Dups_Validation.png" alt="Error icon displayed on tabs" />

<p><span>If the properties window still has an error after the user has visited it once, the error icon appears on the Remove Dups component. This error icon is removed only when all the mandatory fields are supplied with correct values.</span></p>
<img src="../../images/Remove_Dups_Validation_Error.PNG" alt="Error icon displayed on component" />


</body>
</html>